Enabling scalable scientific workflow management in the Cloud

Authors

  • Yong Zhao
  • Youfu Li
  • Ioan Raicu
  • Shiyong Lu
  • Wenhong Tian
  • Heng Liu
Abstract

Cloud computing is gaining tremendous momentum in both academia and industry. In this context, we define the term “Cloud Workflow” as the specification, execution and provenance tracking of large-scale scientific workflows, together with the management of the data and computing resources that support their execution in the Cloud. In this paper, we first analyze the gap between Cloud computing and scientific workflow management, two complementary technologies, and what it means to bring them together. We then present the key challenges in supporting Cloud workflows and our reference framework for scientific workflow management in the Cloud. Finally, we describe our experience in integrating a scientific workflow management system, Swift, into the Cloud. We discuss the performance of cluster provisioning within the OpenNebula Cloud platform, the Eucalyptus Cloud platform and Amazon EC2, and we demonstrate the capability and efficiency of the integration using a NASA MODIS image processing workflow and the Montage image mosaic workflow.

Note to Practitioners: Scientific workflow management plays a very important role in scientific computing and application coordination, while Cloud computing offers scalability and on-demand resources. We devise autonomous methods to integrate scientific workflow management systems with Cloud platforms and to provision resources for large-scale workflows, which help scientists manage their workflows in the Cloud and take advantage of large-scale Cloud resources. There are a few integration options and many challenges in the process, and the experience we gain will help researchers migrate their workflow management systems and workflow applications into the Cloud.

Similar resources

A Clustering Approach to Scientific Workflow Scheduling on the Cloud with Deadline and Cost Constraints

One of the main features of High Throughput Computing systems is the availability of high power processing resources. Cloud Computing systems can offer these features through concepts like Pay-Per-Use and Quality of Service (QoS) over the Internet. Many applications in Cloud computing are represented by workflows. Quality of Service is one of the most important challenges in the context of sche...

Editorial: Scientific Workflows, Provenance and Their Applications

Scientific workflows play a crucial role in modern eScience [5] where many significant scientific discoveries are achieved through complex and distributed computations. For many scientists in the Life Sciences, in bioinformatics, geosciences, chemistry, physics, and numerous other domains, scientific workflows have become an enabling technology to formalize and automate complex and data intensi...

Multi-objective and Scalable Heuristic Algorithm for Workflow Task Scheduling in Utility Grids

To use services transparently in a distributed environment, the Utility Grids develop a cyber-infrastructure. The parameters of the Quality of Service such as the allocation-cost and makespan have to be dealt with in order to schedule workflow application tasks in the Utility Grids. Optimization of both target parameters above is a challenge in a distributed environment and may conflict one an...

Integration of cloud-based services into distributed workflow systems: challenges and solutions

The paper introduces the challenges of modern workflow management in distributed environments spanning multiple cluster, grid and cloud systems. Recent developments in cloud computing infrastructures are presented, along with how clouds can be incorporated into distributed workflow management alongside the local and grid systems considered so far. Several challenges concerning workflow defi...

Migrating Scientific Workflow Management Systems from the Grid to the Cloud

Cloud computing is an emerging computing paradigm that can offer unprecedented scalability and resources on demand, and is gaining significant adoption in the science community. At the same time, scientific workflow management systems provide essential support and functionality to scientific computing, such as management of data and task dependencies, job scheduling and execution, provenance tr...

Journal:
  • Future Generation Comp. Syst.

Volume 46

Publication date: 2015